Skip to content

proxy: add support for anthropic v1/messages api#417

Merged
mostlygeek merged 2 commits intomainfrom
support-anthropic-api
Nov 30, 2025
Merged

proxy: add support for anthropic v1/messages api#417
mostlygeek merged 2 commits intomainfrom
support-anthropic-api

Conversation

@mostlygeek
Copy link
Owner

@mostlygeek mostlygeek commented Nov 30, 2025

llama.cpp now supports the anthropic /v1/messages api (ggml-org/llama.cpp#17570). This commit adds support for it in llama-swap.

Since llama-swap only needs the model parameter this was a fairly trivial addition.

Summary by CodeRabbit

  • Refactor

    • Optimized internal routing for API endpoints (chat completions, text completions, messaging, embeddings, reranking, audio) by switching to an updated request handler — no change to external behavior.
  • Bug Fix

    • Loading-status streaming now only appears for chat-completion requests, preventing unrelated endpoints from emitting loading indicators.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 30, 2025

Walkthrough

Renamed the request handler in proxy from proxyOAIHandler to proxyInferenceHandler and updated route registrations to use it. Adjusted proxy request streaming: loading-state streaming is now gated to paths starting with /v1/chat/completions only.

Changes

Cohort / File(s) Summary
Handler rename & route updates
proxy/proxymanager.go
Renamed method (*ProxyManager).proxyOAIHandler(*ProxyManager).proxyInferenceHandler; updated route registrations to use the renamed handler for /v1/chat/completions, /v1/completions, /v1/messages, /v1/embeddings, /reranking, /rerank, /v1/rerank, /v1/reranking, /infill, /completion, and /v1/audio/speech.
Streaming gating change
proxy/process.go
Modified ProxyRequest flow to stream loading-state only when SendLoadingState is enabled, isStreaming is true, AND the request path starts with /v1/chat/completions; added a log path for when streaming is disabled for other endpoints.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review route registrations in proxy/proxymanager.go to ensure no endpoints were missed or miswired.
  • Verify renamed handler has no remaining references to the old name and that tests (if any) are updated.
  • Inspect proxy/process.go streaming gating logic to confirm intended behavior for non-chat endpoints and logging.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding support for the Anthropic v1/messages API endpoint, which is reflected in the routing updates and handler registrations in the changeset.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch support-anthropic-api

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
proxy/process.go (1)

510-518: Clarify the debug log to reflect all gating conditions for the loading-state SSE

The isChatCompletions guard correctly limits OAI-shaped loading-state SSE to /v1/chat/completions, which is appropriate for Anthropic /v1/messages compatibility. However, the else-branch log fires for three different conditions (SendLoadingState unset/false, non-streaming, or non-chat-completions path) but only mentions SendLoadingState, making debugging harder.

Update the log message to reflect all gating conditions:

-            p.proxyLogger.Debugf("<%s> SendLoadingState is nil or false, not streaming loading state", p.ID)
+            p.proxyLogger.Debugf(
+                "<%s> loading-state streaming disabled (SendLoadingState=%v, streaming=%v, isChatCompletions=%v)",
+                p.ID,
+                p.config.SendLoadingState != nil && *p.config.SendLoadingState,
+                isStreaming,
+                isChatCompletions,
+            )

Per project guidelines, run make test-dev after this change to verify static analysis and tests pass.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5befa0a and 4e86239.

📒 Files selected for processing (1)
  • proxy/process.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
proxy/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Run make test-dev when making iterative changes to code under the proxy/ directory - this runs go test and staticcheck, and all static checking errors must be fixed

Files:

  • proxy/process.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant